fix: auto-reject PROGRAM messages with non-dict metadata by odesenfans · Pull Request #1137 · aleph-im/pyaleph

odesenfans · 2026-05-14T23:12:30Z

Summary

Some PROGRAM messages slipped past validation at a time when ExecutableContent.metadata still accepted lists. The current validator requires a dict, so reading those rows fails in parsed_content and surfaces as a 500 on GET /api/v0/messages/<hash> (for example, 42a4a8...3d96f3 returns 500, while the same hash on epyc is correctly reported as rejected).

This change:

Adds mark_processed_message_as_rejected in aleph.repair. It mirrors mark_pending_message_as_rejected but starts from a MessageDb row instead of a PendingMessageDb: cleans up VM rows for program/instance, upserts rejected_messages, flips message_status to REJECTED, and deletes the messages row. The trigger keeps message_counts consistent; FK cascades clean message_confirmations and account_costs.
Adds _reject_invalid_program_metadata and wires it into repair_node so the API rejects affected PROGRAM messages on every startup. The query uses jsonb_typeof(content->'metadata') = 'array'; an empty result is a no-op.
Ships deployment/scripts/reject_processed_messages.py for ad-hoc cleanups when a restart is not an option. Dry-run by default, --commit to persist; targets specific hashes via --hash / --hashes-file. Runs from inside the API container against the deployed config at /var/pyaleph/config.yml.
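
The selection rule in the second item can be sketched in plain Python. This is a minimal illustration, not the code in aleph.repair: the names jsonb_typeof_like and select_invalid_program_hashes, the row layout (dicts with item_hash, type, content), and the "PROGRAM" type string are assumptions made for the example. Only the jsonb_typeof(content->'metadata') = 'array' predicate comes from the description above.

```python
from typing import Any, Dict, Iterable, List


def jsonb_typeof_like(value: Any) -> str:
    """Approximate PostgreSQL's jsonb_typeof() for an already-decoded JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {type(value).__name__}")


def select_invalid_program_hashes(rows: Iterable[Dict[str, Any]]) -> List[str]:
    """Return hashes of PROGRAM rows whose metadata is a JSON array.

    Hypothetical helper: rows are plain dicts, not MessageDb objects.
    A missing metadata key is skipped, as SQL NULL would not equal 'array'.
    """
    return [
        row["item_hash"]
        for row in rows
        if row["type"] == "PROGRAM"
        and "metadata" in row["content"]
        and jsonb_typeof_like(row["content"]["metadata"]) == "array"
    ]
```

With this rule, dict and null metadata are left alone, other message types are never selected, and an empty input yields an empty list, which matches the no-op behaviour described for an empty result.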

Test plan

venv/bin/python -m pytest tests/test_repair.py -v — 5 tests, all pass (rejects list metadata, preserves dict/None metadata, ignores non-program types, no-op on empty DB).
venv/bin/python -m pytest tests/db/test_messages.py tests/db/test_credit_balances.py — adjacent suites still pass (63 total).
venv/bin/ruff check + black + isort clean on changed files.
Manual: on a staging snapshot, confirm targeted hashes flip from PROCESSED to REJECTED and GET /messages/<hash> no longer 500s.

🤖 Generated with Claude Code

Some PROGRAM messages slipped past validation while ExecutableContent.metadata accepted lists. The current validator requires a dict, so reading those rows fails parsed_content and surfaces as 500s on GET /messages/<hash>. Move them to REJECTED at startup so the API renders them like nodes that rejected them in the first place. The transition logic also lives behind a deployment/scripts helper for ad-hoc cleanups when waiting for a restart is not an option. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

foxpatch-aleph

Clean, well-structured fix for a production bug where PROGRAM messages with list-typed metadata cause 500s via parsed_content. Implements a reusable rejection utility for processed messages, wires a repair function into startup, and ships a companion CLI script. Thorough test coverage and good code quality throughout.

src/aleph/repair.py (line 69): Consider session.execute(delete_vm_updates(...)) instead of _ = list(...): it avoids loading results into memory and makes the intent clearer. If list() is kept because it forces execution, a comment explaining why would help maintainers.

foxpatch-aleph

A well-structured fix for PROGRAM messages with invalid list-typed metadata that cause 500 errors. The approach is correct: repair_node rejects them at startup, a standalone script handles ad-hoc cases, and both properly clean up VM rows and cascade to account_costs via FK. The race condition between the initial query and per-hash processing is properly handled with a re-check. Tests cover the main scenarios (list/dict/null metadata, non-PROGRAM types, empty DB). No bugs or security issues found.
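
The re-check mentioned in this review can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the PR's implementation: FakeStore, still_invalid, and reject_candidates are hypothetical names, the three dicts stand in for the messages, message_status, and rejected_messages tables, and the exact conditions the real re-check tests are not given in the conversation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class FakeStore:
    """Hypothetical in-memory stand-in for the database tables."""

    messages: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    status: Dict[str, str] = field(default_factory=dict)
    rejected: Dict[str, str] = field(default_factory=dict)


def still_invalid(store: FakeStore, item_hash: str) -> bool:
    """Re-read the row and confirm it still qualifies for rejection."""
    message = store.messages.get(item_hash)
    if message is None or store.status.get(item_hash) != "PROCESSED":
        return False  # row vanished or changed state since the initial query
    metadata = message["content"].get("metadata")
    return message["type"] == "PROGRAM" and isinstance(metadata, list)


def reject_candidates(
    store: FakeStore, candidates: Iterable[str], reason: str
) -> List[str]:
    """Reject each candidate, skipping any that no longer qualifies."""
    rejected_now: List[str] = []
    for item_hash in candidates:
        if not still_invalid(store, item_hash):
            continue
        store.rejected[item_hash] = reason    # upsert into rejected_messages
        store.status[item_hash] = "REJECTED"  # flip message_status
        del store.messages[item_hash]         # delete the messages row
        rejected_now.append(item_hash)
    return rejected_now
```

The point of the per-hash check is that the candidate list is only a snapshot: a row that was removed or already rejected between the initial query and its turn in the loop is skipped rather than processed twice.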

deployment/scripts/reject_processed_messages.py (line 256): Minor: changed is incremented in both the --commit path and the dry-run path, so the summary count is not strictly 'changed' in the commit sense. Consider using two separate counters or mentioning 'processed' in the count label.
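
One way to follow this suggestion is to keep two counters. The sketch below is hypothetical and does not reproduce the script: only the --commit, --hash, and --hashes-file flags come from the PR description, while build_parser, run, the reject callback, the repeatable --hash, and the one-hash-per-line file format are assumptions for illustration.

```python
import argparse
from typing import Callable, List, Sequence, Tuple


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Reject processed messages.")
    # Assumption: --hash may be repeated to target several messages.
    parser.add_argument("--hash", dest="hashes", action="append", default=None)
    # Assumption: the file holds one item hash per line.
    parser.add_argument("--hashes-file", default=None)
    # Dry-run is the default; changes persist only with --commit.
    parser.add_argument("--commit", action="store_true")
    return parser


def run(argv: Sequence[str], reject: Callable[[str], None]) -> Tuple[int, int]:
    """Return (committed, would_change) so the summary label stays accurate."""
    args = build_parser().parse_args(argv)
    hashes: List[str] = list(args.hashes or [])
    if args.hashes_file:
        with open(args.hashes_file, encoding="utf-8") as fh:
            hashes.extend(line.strip() for line in fh if line.strip())

    committed = 0
    would_change = 0
    for item_hash in hashes:
        if args.commit:
            reject(item_hash)  # hypothetical callback doing the real work
            committed += 1
        else:
            would_change += 1
    print(f"committed={committed} would_change={would_change}")
    return committed, would_change
```

Separating the counters means a dry run reports only would_change and a committed run reports only committed, so the summary never labels a simulated rejection as a change.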

foxpatch-aleph previously approved these changes May 14, 2026

test: wrap item_hash in ItemHash for get_message_status calls

9e272e2

mypy on CI flagged the str args. Pick a hex-valid prefix for the good sample hash too so ItemHash() does not reject it at runtime. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

odesenfans dismissed foxpatch-aleph’s stale review via 9e272e2 May 18, 2026 08:50

foxpatch-aleph approved these changes May 18, 2026

odesenfans merged commit 1ca5516 into main May 18, 2026
4 checks passed

odesenfans deleted the fix/reject-program-invalid-metadata branch May 18, 2026 09:50

odesenfans merged 2 commits into main from fix/reject-program-invalid-metadata
